
Collaborating Authors

Adaptive Systems


Review for NeurIPS paper: Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs

Neural Information Processing Systems

Summary and Contributions: Post-rebuttal: I would like to thank the authors for their response. As stated in the original review, I think comparing to DQN will improve the paper. This paper addresses the problem of robust control of continuous dynamical systems whose dynamics are unknown but assumed to be linear, subject to an external polytopic disturbance. The proposed approach performs several steps for each action: first, model and confidence-region estimation (or refinement); then extraction of worst-case reward and state-estimation bounds; next, a conservative planning step based on those bounds; and finally execution of a single step, with the process repeated in an MPC-like manner. The paper presents an end-to-end approach to the robust control problem for unknown dynamics (only the system dynamics matrix is unknown) in an adaptive manner.
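A minimal sketch of this adapt/plan/execute loop, assuming a Gym-style `env` and treating `estimate_confidence_region`, `worst_case_bounds`, and `conservative_plan` as hypothetical stand-ins for the paper's estimation, interval-prediction, and robust-planning components:

```python
def robust_adaptive_mpc(env, estimate_confidence_region, worst_case_bounds,
                        conservative_plan, horizon, n_steps):
    """Receding-horizon loop sketched from the summary above; the three
    callables stand in for the paper's estimation, interval-prediction,
    and robust-planning steps (they are not the authors' implementations)."""
    history = []                                      # observed (state, action, next_state) triples
    state = env.reset()
    for _ in range(n_steps):
        # 1. Refine the dynamics estimate and its confidence region from data.
        theta_hat, region = estimate_confidence_region(history)
        # 2. Propagate worst-case state and reward bounds over the horizon.
        state_bounds, reward_bounds = worst_case_bounds(state, theta_hat, region, horizon)
        # 3. Plan conservatively against those bounds.
        plan = conservative_plan(state, state_bounds, reward_bounds, horizon)
        # 4. Execute only the first planned action, then repeat (MPC-style).
        next_state, reward, done, info = env.step(plan[0])
        history.append((state, plan[0], next_state))
        state = next_state
        if done:
            break
    return history
```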


Reviews: Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator

Neural Information Processing Systems

The central element of the paper is a (novel) algorithm that utilizes a convex optimization approach (the so-called System Level Synthesis approach, SLS) for synthesizing LQR controllers using estimated dynamics models. The SLS approach allows for an analysis of how the error in the matrix estimation affects the regret of the LQR controller. Using this controller synthesis, upper bounds on the estimation error of the dynamics matrices as well as upper and lower bounds on the expected loss are provided. The method is compared to existing approaches on a benchmark system. This computational study shows comparable performance across all methods, with the presented method giving the nicest theoretical guarantees.
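For concreteness, a hedged sketch of the estimate-then-synthesize pipeline that the regret analysis is built around: least-squares identification of the dynamics matrices followed by controller synthesis on the estimated model. The certainty-equivalent Riccati solve below is only a stand-in; the paper's contribution is to replace that step with a robust SLS synthesis that accounts for the estimation error. The data layout (row-stacked `states`, `inputs`, `next_states`) is an assumption for illustration.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def estimate_dynamics(states, inputs, next_states):
    """Least-squares estimate of (A, B) from trajectory data: x_{t+1} ~ A x_t + B u_t."""
    Z = np.hstack([states, inputs])                    # regressors [x_t, u_t]
    Theta, *_ = np.linalg.lstsq(Z, next_states, rcond=None)
    n = states.shape[1]
    return Theta[:n].T, Theta[n:].T                    # (A_hat, B_hat)

def lqr_gain(A, B, Q, R):
    """Certainty-equivalent LQR gain on the estimated model; the paper replaces
    this step with a robust SLS synthesis driven by the estimation error bound."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```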


Adaptive control of reaction-diffusion PDEs via neural operator-approximated gain kernels

arXiv.org Artificial Intelligence

Neural operator approximations of the gain kernels in PDE backstepping have emerged as a viable method for implementing controllers in real time. With such an approach, one approximates the gain kernel, which maps the plant coefficient into the solution of a PDE, with a neural operator. It is in adaptive control that the benefit of the neural operator is realized, as the kernel PDE solution needs to be computed online for every updated estimate of the plant coefficient. We extend the neural operator methodology from adaptive control of a hyperbolic PDE to adaptive control of a benchmark parabolic PDE (a reaction-diffusion equation with a spatially-varying and unknown reaction coefficient). We prove global stability and asymptotic regulation of the plant state for a Lyapunov design of parameter adaptation. The key technical challenge of the result is handling the 2D nature of the gain kernels and proving that the target system, with two distinct sources of perturbation terms due to the parameter estimation error and the neural approximation error, is Lyapunov stable. To verify our theoretical result, we present simulations achieving calculation speedups of up to 45x relative to traditional finite-difference solvers for every timestep in the simulation trajectory.
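A hedged sketch of where the neural operator enters the adaptive loop: at each step the current estimate of the reaction coefficient is mapped to the boundary gain kernel by a trained surrogate (the `kernel_operator` callable below is an assumed placeholder for that operator), avoiding an online kernel PDE solve. The Lyapunov adaptation law that updates the coefficient estimate is not shown.

```python
import numpy as np

def adaptive_backstepping_input(u, lam_hat, kernel_operator, dx):
    """One evaluation of the boundary control input (illustrative sketch).

    u               : plant state u(x, t) sampled on a spatial grid
    lam_hat         : current estimate of the reaction coefficient on the same grid
    kernel_operator : callable approximating the map lam_hat -> k(1, y); in the
                      paper a trained neural operator plays this role in place of
                      an online finite-difference kernel solve
    dx              : grid spacing used for the quadrature
    """
    k_boundary = kernel_operator(lam_hat)        # fast surrogate for the kernel PDE solution
    return float(np.sum(k_boundary * u) * dx)    # U(t) ~ integral of k(1, y) u(y, t) dy
```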


Synchronisation-Oriented Design Approach for Adaptive Control

arXiv.org Artificial Intelligence

This study presents a synchronisation-oriented perspective on adaptive control, which views model-referenced adaptation as synchronisation between actual and virtual dynamic systems. In the context of adaptation, model reference adaptive control methods make the state response of the actual plant follow a reference model. In the context of synchronisation, consensus methods involving diffusive coupling induce a collective behaviour across multiple agents. We draw on the two-time-scale nature of synchronisation, an understanding motivated by the study of blended dynamics. The synchronisation-oriented approach consists of designing a coupling input to achieve the desired closed-loop error dynamics, followed by an input allocation process to shape the collective behaviour. We suggest that synchronisation can serve as a design principle enabling a more holistic and systematic approach to the design of adaptive control systems with improved transient characteristics. Most notably, the proposed approach enables not only constructive derivation but also substantial generalisation of the previously developed closed-loop reference model adaptive control method. The practical significance of the proposed generalisation lies in its capability to improve the transient response characteristics while mitigating the unwanted peaking phenomenon.


Adaptive Control of Euler-Lagrange Systems under Time-varying State Constraints without a Priori Bounded Uncertainty

arXiv.org Artificial Intelligence

In this article, a novel adaptive controller is designed for Euler-Lagrange systems under predefined time-varying state constraints. The proposed controller achieves this objective without a priori knowledge of the system parameters and, crucially, of the state-dependent uncertainties. Closed-loop stability is verified using the Lyapunov method, while the overall efficacy of the proposed scheme is demonstrated on a simulated robotic arm and compared to the state of the art.


A Neural Net Model for Adaptive Control of Saccadic Accuracy by Primate Cerebellum and Brainstem

Neural Information Processing Systems

Accurate saccades require interaction between brainstem circuitry and the cerebellum. A model of this interaction is described, based on Kawato's principle of feedback-error-learning. In the model a part of the brainstem (the superior colliculus) acts as a simple feedback controller with no knowledge of initial eye position, and provides an error signal for the cerebellum to correct for eye-muscle nonlinearities. This teaches the cerebellum, modelled as a CMAC, to adjust appropriately the gain on the brainstem burst-generator's internal feedback loop and so alter the size of burst sent to the motoneurons. With direction-only errors the system rapidly learns to make accurate horizontal eye movements from any starting position, and adapts realistically to subsequent simulated eye-muscle weakening or displacement of the saccadic target.
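A compact sketch of the feedback-error-learning arrangement described here, assuming a `cmac` object exposing `predict` and `update` (an illustrative interface, not the paper's model code): the crude feedback command both corrects the movement and serves as the training signal for the learned feedforward correction.

```python
def feedback_error_learning_step(cmac, state, target, feedback_gain, lr):
    """One control cycle of feedback-error-learning (Kawato-style sketch).

    The simple feedback controller (the brainstem/superior colliculus in the
    model) supplies both a corrective command and the training signal for the
    learned feedforward module (the cerebellum, modelled as a CMAC)."""
    features = (state, target)
    u_ff = cmac.predict(features)              # learned cerebellar correction
    u_fb = feedback_gain * (target - state)    # crude feedback command, no plant model
    cmac.update(features, lr * u_fb)           # feedback command acts as the error signal
    return u_ff + u_fb                         # total command sent to the burst generator
```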


Fast, Robust Adaptive Control by Learning only Forward Models

Neural Information Processing Systems

A large class of motor control tasks requires that on each cycle the controller is told its current state and must choose an action to achieve a specified, state-dependent, goal behaviour. This paper argues that optimization of the learning rate (the number of experimental control decisions before adequate performance is obtained) and of robustness is of prime importance, if necessary at the expense of computation per control cycle and memory requirements. This is motivated by the observation that a robot which requires two thousand learning steps to achieve adequate performance, or a robot which occasionally gets stuck while learning, will always be undesirable, whereas moderate computational expense can be accommodated by increasingly powerful computer hardware. It is not unreasonable to assume the existence of inexpensive 100 Mflop controllers within a few years, and so even processes with control cycles in the low tens of milliseconds will have millions of machine instructions in which to make their decisions. This paper outlines a learning control scheme which aims to make effective use of such computational power.
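The scheme trades computation per control cycle for data efficiency by searching a learned forward model at decision time. A minimal illustration, assuming `forward_model(state, action)` is any learned predictor of the next state (the names and search strategy are illustrative, not the paper's):

```python
import numpy as np

def choose_action(forward_model, state, goal, candidate_actions):
    """Pick the action whose predicted outcome under the learned forward model
    is closest to the state-dependent goal; a brute-force search stands in for
    whatever per-cycle optimization the available compute budget allows."""
    predictions = [forward_model(state, a) for a in candidate_actions]
    errors = [np.linalg.norm(pred - goal) for pred in predictions]
    return candidate_actions[int(np.argmin(errors))]
```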


Identification and Adaptive Control of Markov Jump Systems: Sample Complexity and Regret Bounds

arXiv.org Machine Learning

Learning how to effectively control unknown dynamical systems is crucial for intelligent autonomous systems. This task becomes a significant challenge when the underlying dynamics are changing with time. Motivated by this challenge, this paper considers the problem of controlling an unknown Markov jump linear system (MJS) to optimize a quadratic objective. By taking a model-based perspective, we consider identification-based adaptive control for MJSs. We first provide a system identification algorithm for MJS to learn the dynamics in each mode as well as the Markov transition matrix, underlying the evolution of the mode switches, from a single trajectory of the system states, inputs, and modes. Through mixing-time arguments, the sample complexity of this algorithm is shown to be $\mathcal{O}(1/\sqrt{T})$. We then propose an adaptive control scheme that performs system identification together with certainty equivalent control to adapt the controllers in an episodic fashion. Combining our sample complexity results with recent perturbation results for certainty equivalent control, we prove that when the episode lengths are appropriately chosen, the proposed adaptive control scheme achieves $\mathcal{O}(\sqrt{T})$ regret, which can be improved to $\mathcal{O}(\mathrm{polylog}(T))$ with partial knowledge of the system. Our proof strategy introduces innovations to handle Markovian jumps and a weaker notion of stability common in MJSs. Our analysis provides insights into system theoretic quantities that affect learning accuracy and control performance. Numerical simulations are presented to further reinforce these insights.
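A simplified sketch of the identification step only: per-mode least squares plus an empirical transition matrix, estimated from a single trajectory of states, inputs, and modes. The paper's episodic scheme would then feed these estimates into certainty-equivalent controller synthesis; the data layout below is an assumption for illustration, and the excitation and mixing-time conditions behind the sample-complexity bound are omitted.

```python
import numpy as np

def identify_mjs(states, inputs, modes, num_modes):
    """Per-mode least-squares dynamics estimates and empirical Markov transition
    matrix from a single trajectory (illustrative sketch).

    states : (T+1, n) array, inputs : (T, m) array, modes : (T+1,) int array."""
    states, inputs, modes = map(np.asarray, (states, inputs, modes))
    n = states.shape[1]
    dynamics = {}
    for i in range(num_modes):
        idx = np.where(modes[:-1] == i)[0]             # time steps spent in mode i
        if idx.size == 0:
            continue                                   # mode never visited in this trajectory
        Z = np.hstack([states[idx], inputs[idx]])      # regressors [x_t, u_t]
        Theta, *_ = np.linalg.lstsq(Z, states[idx + 1], rcond=None)
        dynamics[i] = (Theta[:n].T, Theta[n:].T)       # estimates (A_i, B_i)
    T_hat = np.zeros((num_modes, num_modes))
    for a, b in zip(modes[:-1], modes[1:]):            # count observed mode transitions
        T_hat[a, b] += 1
    T_hat /= np.maximum(T_hat.sum(axis=1, keepdims=True), 1)
    return dynamics, T_hat
```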


Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control

arXiv.org Artificial Intelligence

In this paper we aim to provide analysis and insights (often based on visualization), which explain the beneficial effects of on-line decision making on top of off-line training. In particular, through a unifying abstract mathematical framework, we show that the principal AlphaZero/TD-Gammon ideas of approximation in value space and rollout apply very broadly to deterministic and stochastic optimal control problems, involving both discrete and continuous search spaces. Moreover, these ideas can be effectively integrated with other important methodologies such as model predictive control, adaptive control, decentralized control, discrete and Bayesian optimization, neural network-based value and policy approximations, and heuristic algorithms for discrete optimization.
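A hedged sketch of the core idea of approximation in value space with rollout: score each candidate action by simulating a base (off-line) policy for a few steps under a model and bootstrapping with an off-line value approximation, then act greedily on those scores. `model`, `base_policy`, and `value_fn` are assumed callables for illustration, not an API from the paper.

```python
def rollout_lookahead(state, actions, model, base_policy, value_fn, depth):
    """One-step lookahead with rollout: on-line decision making layered on top
    of off-line training (the base policy and the value approximation)."""
    best_action, best_score = None, float("-inf")
    for a in actions:
        next_state, total = model(state, a)            # model(s, a) -> (next_state, reward)
        for _ in range(depth):                         # rollout under the off-line base policy
            next_state, r = model(next_state, base_policy(next_state))
            total += r
        total += value_fn(next_state)                  # bootstrap with the off-line value estimate
        if total > best_score:
            best_action, best_score = a, total
    return best_action
```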


Safe and Efficient Model-free Adaptive Control via Bayesian Optimization

arXiv.org Artificial Intelligence

Adaptive control approaches yield high-performance controllers when a precise system model or suitable parametrizations of the controller are available. Existing data-driven approaches for adaptive control mostly augment standard model-based methods with additional information about uncertainties in the dynamics or about disturbances. In this work, we propose a purely data-driven, model-free approach for adaptive control. Tuning low-level controllers based solely on system data raises concerns about the safety and computational performance of the underlying algorithm. Thus, our approach builds on GoOSE, an algorithm for safe and sample-efficient Bayesian optimization. We introduce several computational and algorithmic modifications in GoOSE that enable its practical use on a rotational motion system. We numerically demonstrate for several types of disturbances that our approach is sample efficient, outperforms constrained Bayesian optimization in terms of safety, and achieves the performance optima computed by grid evaluation. We further demonstrate the proposed adaptive control approach experimentally on a rotational motion system.
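A minimal safe-tuning loop in the spirit described above, not GoOSE itself: Gaussian-process surrogates (here scikit-learn's `GaussianProcessRegressor`) model performance and a safety measure, only candidates whose pessimistic safety estimate clears a threshold are eligible, and the best predicted performer among them is evaluated on the system. `evaluate(x)` returning a (performance, safety) pair, and the candidate grid, are assumptions for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def safe_tune(evaluate, candidates, x0, safety_threshold, n_iters, beta=2.0):
    """Safe-BO-style controller tuning sketch (simplified; not the GoOSE algorithm)."""
    X, perf, safe = [x0], [], []
    p0, s0 = evaluate(x0)                              # seed with a known-safe parameter setting
    perf.append(p0)
    safe.append(s0)
    gp_perf, gp_safe = GaussianProcessRegressor(), GaussianProcessRegressor()
    for _ in range(n_iters):
        gp_perf.fit(np.array(X), np.array(perf))
        gp_safe.fit(np.array(X), np.array(safe))
        mu_p = gp_perf.predict(candidates)
        mu_s, sd_s = gp_safe.predict(candidates, return_std=True)
        feasible = mu_s - beta * sd_s >= safety_threshold   # pessimistic safety check
        if not feasible.any():
            break                                      # nothing is certifiably safe to try
        idx = int(np.argmax(np.where(feasible, mu_p, -np.inf)))
        p, s = evaluate(candidates[idx])               # run the best safe candidate on the system
        X.append(candidates[idx])
        perf.append(p)
        safe.append(s)
    return X[int(np.argmax(perf))]                     # best-performing parameters seen so far
```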